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1.  Introduction 


The  minimum  variance  linear  estimates  of  values  of  a  variable  s  (V> ,  A ,  r) 
from  a  finite  set  of  measurements  consisting  of  samples  of  a  signal  z  (tp,  A  ,  r) 
plus  measurement  noise  n(P,A,  r),  can  be  obtained  using  the  following  formulas 
(Moritz,  1972): 

s  =  Fd 

F  =  C„  Cm  (1. 1) 

Cdd  =  C,J  +  D 

The  variance -covariance  matrix  of  the  estimation  errors  is 

E  =  M[(s  -  Fd)  (s  -  Fd)T } 

=  C..  -  FCjj-  C,,Ft  +  FCdd  Ft 
or 

E  =  C„  -  C.jCdd1  C[t 

in  the  case  of  the  optimal  estimator  F  as  given  in  (1. 1)  above.  Here 

M  f  ]  is  some  kind  of  average  over  the  sphere, 

^  is  the  N,  vector  of  estimates, 
d  is  the  Nd  data  vector, 

CiX  is  the  N,  x  Nd  covariance  matrix  of  true  values  of  s  and  z, 

Cdd  is  the  Nd  x  Nd  data  covariance  matrix, 

C„  is  the  NdxNd  measurements' signal  covariance  matrix, 

D  is  the  Nd  x  Nd  measurements'  noise  variance-covariance  matrix, 

C„  is  the  N«  x  N«  covariance  matrix  of  the  true  values  of  s. 

All  variables  are  supposed  to  have  zero  mean  values:  M  f  s]  =  M  [  z  J  =  M{n}  = 

M  {1}  =0,  while  the  noise  is  assumed  to  be  uncorrelated  both  with  the  signal 
s(«p,A,r)  and  with  The  estimates  s^  obtained  by  using  the  estimator 

F  defined  in  (1.1)  have  a  corresponding  E  matrix  whose  diagonal  elements  are 
the  smallest  for  all  possible  linear  estimators  with  the  same  data  pattern.  In 
this  sense,  F  is  optimal.  This  estimator  depends  on  the  particular  M  {  }  chosen, 
as  explained  by  Rummel  and  Schwarz  (1977),  because  this  affects  the  elements  of 
the  covariance  matrices.  In  turn,  these  influence  the  actual  estimates  s,  and 
E  as  well. 

A  major  problem  with  this  method,  also  known  as  collocation,  is  that  the 
number  of  measurements  Nd  is  also  the  dimension  of  Ca  ,  so  setting  up  this 
matrix  requires  computing  up  to  ^Nd3  different  elements.  These  elements  are 
given  by  the  expression 

CdV  =  M{z(<Pi,At,ri)  z(<Pj,X  3,rj)]  +  M{n(tps,X1(  r,)n(<pj,Xj,  rj)} 

=  c„(P:,Pj)  +  cnn(P„Pj)  (1.3) 

where  c,t  is  the  "covariance  function"  of  the  signal  z,  and  c„n  is  that  of  the 


(1.2-a) 

(1.2-b) 


noise  n.  Both  depend,  generally,  on  the  sampling  points  P;  and  Pj  .  Inmost 
cases 

o,.(P,.P,)-  {o'£hi™L  I1-4' 

which  is  what  is  meant  by  the  words  "white  noise".  Calculating  the  covariance 
functions' values  may  involve  several  operations  in  a  computer.  Sometimes  a 
large  number  of  terms  of  a  series  expansion  may  have  to  be  evaluated;  even  with 
closed  expressions  this  can  be  a  time  consuming  enterprise.  Moreover,  solving 
the  "normal"  equations 

Cdd  Ft  =  C*x  (1.5) 

to  find  F  requires  an  additional  kN3  operations,  k  being  a  constant  character¬ 
istic  of  the  method  used.  Both  tasks  can  be  formidable  with  large  numbers  of 
measurements,  presenting  the  paradox  that,  while  more  data  must  result  (theo¬ 
retically)  in  better  estimates,  these  are  harder  to  obtain  and  are  worse  affected 
by  numerical  errors.  The  discussion  that  follows  will  show  how,  for  certain 
types  of  covariances  and  certain  distributions  of  data,  the  problem  becomes 
manageable  even  with  large  data  sets. 


2.  Limitations  on  the  Data  Arrangement  and  on  the  Covariance  Functions 

Geodetic  data,  such  as  gravity  anomalies,  geoid  undulations,  etc. ,  are  given 
usually  in  the  form  of  either  point  values  or  of  area  means.  In  each  case  let 
the  following  conditions  be  satisfied; 


Point  Values 

C-l  AH  data  points  are  on  the  same 
sphere  of  radius  R  ; 

C-2  AH  data  points  are  nodes  in  a 

grid  of  "parallels  and  meridians" 
(Fig.  2.1)  with  poles  excluded; 

C-3  No  data  point  in  a  row  (all  nodes 
along  the  same  parallel)  is  empty; 


Area  Means 

C-l'  All  area  averages  are  taken 

over  blocks  on  the  same  sphere 
of  radius  R ; 

C-2'  All  blocks  belong  to  a  "parallels 
and  meridians"  partition  of  the 
sphere  without  circular  blocks 
(polar  caps)  about  the  poles; 

C-3'  In  a  row  of  blocks  (blocks  bound 
by  the  same  parallels)  if  there  is 
data  in  one,  there  is  data  in  all; 


C-4  The  longitude  increment  between 
adjacent  meridians  is  constant. 


C-4'  All  blocks  have  the  same  longi¬ 
tude  span. 


Figure  2. 1-a.  Example  of  a  grid  of  Figure  2. 1-b.  Grid  of  area  means, 

point  measurements  seen  from  one  of  The  shaded  blocks  contain  data.  The 

the  poles.  Notice  that  the  Pole  itself  "parallels"  and  "meridians"  delimiting 

is  excluded,  and  that  the  separation  the  blocks  are  the  same  as  in  Figure 

of  the  "meridians"  is  constant,  not  1.1-a.  Notice  the  absence  of  an  un- 

so  that  of  the  "parallels".  divided  "polar  cap". 

In  what  follows,  data  points  along  the  same  parallel  (blocks  between  the  same 
bounding  parallels)  form  a  row,  while  those  along  (between)  the  same  (bounding) 
meridian(s)  form  a  column.  Rows  are  numbered  from  N  to  S ,  and  columns  from 
W  to  E.  The  latitude  increments  between  data  points  (span  of  the  blocks  in  latitude) 
do  not  have  to  be  constant,  but  the  separation  in  longitude  (longitude  span  of  the 
blocks)  has  to  be  constant.  Moreover,  rows  of  blocks  with  data  can  be  separated 
by  empty  ones. 

All  rows  must  have  the  same  number  of  points  (blocks),  equal  to  the  number 
of  columns  N- ,  which  is  also  the  number  of  meridians.  A  meridian  is  a  180°  arc 
from  pole  to  pole.  Calling  the  number  of  rows  containing  data  Nr  ,  then  the  total 
number  of  non-empty  points  (blocks)  in  the  grid  is 

N<  =  Nc  X  Nr 

This  is  the  dimens  ion  of  d  and  also  that  of  C4d,  or  the  number  of  values  in  the  data  set. 

Having  explained  the  kind  of  grid  admissible,1  something  must  be  said  about 
the  covariance  functions  (or  the  M{  }  operator)  that  can  be  used.  They  must 
satisfy  the  following  restrictions: 

1  In  general,  the  data  can  be  irregularly  distributed,  but  point  values  can  be 
estimated  at  the  nodes  of  the  grid  from  neighboring  measurements,  and  area 
means  from  the  averages  of  measurements  on  the  same  blocks.  The  estimates' 
variances  should  be  found,  also,  to  set  up  D  . 


Point  Values 


Area  Means 


C-5  Given  two  rows  i  and  j 
(including  i  =  j ),  the  value 
of  the  covariance  function 
C„(P».PjJ  =  MfxlkxjB} 
between  points  Plk  and  Pjs 
must  depend  only  on  |Xk  -X,| 
(including  k  =  m). 


C-5'  Given  two  rows  of  blocks,  i 
and  j  (including  L  =  j  > ,  the 
covariance  between  two  area 
means  lc,k  and  x^,  must 
depend  only  on  |X  !k  -  X 
(including  k=m),  where  Xik 
is  the  longitude  of  the  W 
boundary  of  block  ik  . 


Restriction  C-5  (C-5')  allows  a  variety  of  covariance  functions,  including  the 
so-called  "isotropic"  or  "global”  used  by  the  majority  of  workers  at  present 
(Rummel  and  Schwarz,  1977). 


Usually  the  "noise"  function  n  is  supposed  to  be  uncorrelated  from  meas¬ 
urement  to  measurement  ("white  noise")  resulting  in  a  diagonal  D  matrix.  To 
satisfy  C-5  (or  C-5')  the  variance  of  the  errors  must  be  the  same  at  all  points 
(blocks)  on  the  same  row,  while  it  can  vary  from  row  to  row.  This  assumption 
is  rather  restrictive,  as  in  practice  (particularly  with  area  means,  where  the 
size  of  the  block  changes  with  latitude)  the  standard  deviation  of  the  measurements 
can  vary  from  place  to  place.  Nearly  homogeneous  data  sets  may  become  more 
common  through  an  increase  in  the  measuring  of  the  gravity  field  from  satellites 
(satellite  altimetry,  satellite-satellite  tracking,  etc.  ).  In  some  cases,  even 
when  C-5  or  C-5'  are  not  exactly  fulfilled,  the  noise  fluctuations  along  rows 
might  be  small,  and  the  mean  standard  deviation  of  each  row  could  be  used  to  set 
up  D.  A  refinement  of  this  idea  is  explained  in  the  Appendix. 


3.  The  Structure  of  the  Data  Covariance  Matrix 


When  all  the  limitations  described  in  the  previous  section  are  present, 
matrices  C„  ,  D  and  CdJ  =  C„  +  D  all  have  the  same  well-defined  structure. 

To  make  it  clear,  let  us  arrange  the  measurements  in  d  as  follow's1  : 

d  =  [d,  d„.  ..  d,  ...&,  ] r 
—  — —  —r 

w  he  re  d[  [djgd^.  *  •  d^k.  .  ,  d^^  ^  ] 

so  the  dt  are  Nr  subvectors  of  dimension  Nc  ,  each  containing  the  data  values 
for  one  row.  This  brings  about  the  corresponding  partitioning  of  Cdd  ,  C„  and 
D  into  Nr“  Nc  x  Nc  "row  submatrices"  CiJf  containing  all  the  correlations 
between  points  (blocks)  in  the  ith  and  the  j  th  row.  Assume  N  =  5,  that  the 
point  on  the  ”0  th  meridian"  is  the  first  point  in  any  row,  and  that  all  others  in 
the  same  row  are  ordered,  like  their  meridians,  clockwise  when  seen  from  the 


To  simplify  typing  ( d,  , 


d  1  is  used  throughout  this  work  instead  of 


-4- 


i 


North  Pole  (i.e.,  along  increasing  longitudes).  Then,  the  point  immediately  to 
the  West  of  the  first  is  the  last,  or  Ne  th,  in  any  row.  From  the  various  re¬ 
strictions  mentioned  in  the  previous  section,  both  for  points  and  blocks,  it 
follows  that  if 

a  =  c„(x,o,Xj0) 

b  =  C„(Xl0,Xn)  =  CxxlXto.Xu) 
c  =  C„(Xl0,Xja)  =  C„(Xl0,Xj3) 


are  the  covariances  between  the  first  point  (block)  in  row  i  and  all  the  points 
(blocks)  in  row  j  ,  then 


CU  “ 


a  b  c  c  b 
b  a  b  c  c 
c  b  a  b  c 
c  c  b  a  b 
b  c  c  b  a 


Calling  p  the  row  subscript  and  q  the  column  subscript  of  the  cp'qJ  elements 
of  this  matrix,  we  notice  that  they  have  the  following  properties: 

c  pq  —  Cpi  i  q+x ;  cpl  —  cp_1No  when  p  >  1  ; 
c Pq  =  cp N0+a-q  When  p  =  l,  q>  1 


Matrices  of  this  type  belong  to  the  class  known  as  "circular"  or  "Toeplitz  cir- 
culant"  (Lancaster,  1969).  As  C<i4 ,  C**  and  D  consist  of  blocks  of  this  kind, 
they  can  be  described  as  block  matrices  of  Toeplitz  circulant  blocks.  As  shown 
in  the  remainder  of  this  work,  these  matrices  are  much  easier  to  set  up  and  to 
invert  than  ordinary  symmetric  matrices. 


The  first  row  in  C  u  resembles  a  succession  of  equispaced  samples  of 
some  even  function,  so  it  can  be  represented  exactly  using  a  finite  sum  of 
cosines: 


=k|0a”COS  Ilf  q  l3-1) 

where  2  N  =  Ne  if  Nc  is  even;  2N  +  1  =  N0  if  Nc  is  odd,  and 

a*3  =  4  £  «£/  cos  q  where  H  is  defined  later.  (3.2) 
The  pth  row  is  the  same  as  the  first  rotated  p  times  to  the  right; 

CpV  *•  k£0aJJcos  (q-p) 


.u, 


2nk 


Nc 


N 

=  V 


a^]cos  ^  —  q  cos  ?n Kp  +  ^  a,'J  sin^J-^-q  siniiZii  p  (3.  3) 


v  \_\jo  — n —  u  tus  - IJ  T  'a,,  ^  Ui  VT  v.  - - 

„  =  0  *  Ne  Nc 


From  the  well-known  trigonometric  expressions 
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1 

H 

I 


8 


1 


V, 


-6- 


For  a  given  n  and  k  ,  is  in  a  one-to-one  relationship  with  the  Nr  -vector 
0  ^ k)  : 

g  (k)  =  [<pj  <02*  . . .  . . .  0Nr  ] T  ^=?>  f_g  =  .  .<a£(|£)  ]  (3. 10-a) 

and  similarly  .  . 

1  < k )  =  [ •y1k  yak...  y,k...yNrk]  ^  =  [*(§£)... yNkr(§J) 


Expression  (3.9)  can  be  given  matrix  form: 

Z  (k)  =  R(k)£  (k)4“^  C4d  f.o; 
where  R(k)  is  a  Nr  x  Nr  matrix  with  elements 


rU 


*  =  ak1J  H  = 


(3. 10-b) 

(3.11) 

(3. 12) 


Expressions  (3. 10-a),  (3. 10-b),  (3.11)  and  (3.12)  describe  a  property  of  the 
covariance  matrix  that  is  basic  to  the  algorithm  developed  in  the  next  two  sections. 


3.2  The  Equation  £  =  Cdd  x 

k  k 

Consider  once  more  the  vectors  _fa  and  ga  presented  in  the  previous 
section,  and  the  equation 

C  =  <**  f ;  (3  13) 

9 

k  k 

where  the  components  of  are  the  unknowns.  As  already  explained,  _f&  and 
are  in  one-to-one  relationship  to  g  (k)  and  £  (k),  respectively,  so  (3. 13) 
can  be  regarded  as  equivalent  to  (3. 11),  because  once  we  know 


<pZr]J  =  <0  (k)  =  R(k)'1^  (k) 


we  know 


the  solution  to  (3. 13).  If  Cm  exists,  there  is  always  a  solution  to  (3. 13),  and  thus 
to  (3. 11).  Therefore,  if  exists,  so  does  R(k)-1  for  all  0  *  k  <  N .  While 
the  strict  invertibility  of  the  R(k)  is  not  essential,  as  long  as  is  in  the 
range  of  Qa  ,  the  existence  of  Inverses  simplifies  the  argument.  Assuming  that 
Cla  exists  and  that  £  is  an  arbitrary  ht  Nr-  vector,  then  there  is  another 
N,.  Nr  -vector  x  such  that 


1  =  Qd  ,x  (3. 14) 

With  the  usual  partition  by  data  "rows” : 
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X  =  [Xi  X,.  .  .  XI  .  .  .  3^p  ]  T 
Xi  =  [  Xio  Xjj  .  .  .  Xjj  .  .  •  Xjmc_  j  1 


X  =  Ik  &>•••  Xf‘  >rlT 
Xi  =  [yioyu.  •.yij-..y*i«_ir 


A  sequence  of  Nr  numbers,  such  as  the  elements  of  Xj  or  ,  can  be  repre¬ 
sented  exactly  by  a  sum  of  sine  and  cosine  terms: 


2nk  T  _ 

.  2rrk 

H* 

=  )  C11(  COS 

kS'O 

~N~q  +  '  8tt 
Wr  kan 

sin— q 

(3.15) 

is  as 

in  (3. 1),  and 

clN  =  0  if  Ne 

is  even.  In  matrix  form: 

xM 

N 

~  k=0  Clk  -k  + 

N 

^E^Sik  s_k 

(3. 16-a) 

M 

N 

y* 

=  E  m,k  ck  + 

k  -*0  ~ 

r  Du  Sj, 
k  =  l 

(3.16-b) 

similarly 


Consequently,  vectors  such  as  x  and  y  can  be  represented  by  sums  of  vectors 

of  the  same  form  as  fk  or  above: 

-a  -oa 
v  i  k 

*  =  k£oa5o*a  (3.17-a) 

m  i 

1  ~  aS0  Xoe  •  (*>  =  =  0  if  Nc  is  odd;  x£=}£  =  0  always)  (3. 17-b) 

wlic  r6 

x«=f(c^  £*)  (Clk  Sk^T 

and  ~  L  \  0  V  \slk  s kJ  \sIk  s^J  \Stlrk  s*yj 


xa=F(c?  £jc)fClk  *)—(**  s0.»(°'rk 

~  L\  o  Sj,/  \slk  s^y  \slk  Sj,  y  vs^i,  s^yj 

^  =  [(t  d  e  D-c  i)-c,c  |)]' 


Expression  (3. 14)  can  be  written 

N  1  N  1 

E  E  £,  -  Gw  V  E  xk 

k  =0  o(=o  •‘■a  k=oa=o~Qf 


(3. 18) 


N  1 

=  E  E  Cidxl 

ii=oa=o 


Since  the  product  Cdd  x^  is  another  vector  y ^  of  the  same  "frequency"  k  and  the 
same  a  as  x^,  expression  (3. 18)  can  be  separated  into  2N  or  2N  +  1  systems 
of  equations  : 

-  C44  Hot  •  Xnt  ~  Hot'  •  •  JLa  =  ^4i  Hot"  •  •  ilex =  Cdd  Hot*  ^®-a) 

In  turn,  solving  these  systems  is  the  same  as  finding  the  solutions  to 

|2*(°)  =  R(0)Xa(°).  <*=0  (3.19-b) 

=  R(l)  5La(l)---Ha(k)  =  R(k)*a(k)...u  (N)  =  R(N)WN) 

where 


(3.20-b) 
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3.3  An  Algorithm  for  Solving  the  Normal  Equations  CFT  =  C,T; 

The  optimal  estimator  matrix  F  relating  the  minimum  variance  estimates 
to  the  data  can  be  obtained  by  solving  the  normal  matrix  equations  (1.5)  column 
by  column: 

Qd  il  -  £ /,  (3.21) 

i  l 

where  f_  is  the  transpose  of  the  tth  row  of  F  and  c,,  is  that  of  the  £th  row 

of  C,,  .  So  it  is  necessary  to  solve  N,  systems  of  equations  like  (3.21).  From 

the  previous  section's  analysis  it  is  clear  that  this  can  be  done  by  decomposing 

(3. 21)  into  (at  most)  N«  vectors  of  dimension  N,.  such  as  in(3.i9-a), 

and  then  solving  the  corresponding  systems  ua(k)  =  R(k)xa(k)  of  (3. 19-b). 

It  is  much  easier  to  work  with  the  Nr  x  Nr  matrices  R(k)  than  with  the 

NcNr  x  NcNr  matrix  Ca  ,  even  if  there  are  N  +  1  *■  N^/2  of  the  smaller 

matrices.  The  whole  procedure  can  be  described  as  follows: 

Part  I 


a)  Form  all  R(k)  matrices  (0  £  k  £  N),  by  Fourier  analysis  of  the  first 
row  in  every  submatrix  CtJ  of  Cdd  (expressions  (3.1)  and  (3.2). 

b)  Find  the  corresponding  R(k)-1,  and  save  all  pairs  (R(k),  R(k)-1) 
on  tape  or  disk  if  the  same  type  of  data,  sampled  on  the  same  grid,  is  likely  to 
be  used  in  future  estimates. 

Part  n 


c)  Decompose  the  £th  "right  hand  side"  c,  j  by  Fourier  analysis  of  its 
Nr  partitions,  as  in  expressions  (3.15)  through  (3.17). 

d)  Form  the  Ne  "equivalent  right  hand  side  vectors"  ua(k)  according  to 
(3. 20-b),  and  solve  the  corresponding  N«  equations  (3. 19-b)  to  obtain  the  "equi¬ 
valent  solution  vectors"  J£a(k)  as  in  (3.20-a). 

e)  Use  the  Ne  equivalent  solution  vectors  to  generate,  by  Fourier  synthesis 
(expressions  (3.16-a)  and  (3. 17-a)  )  the  /th  row  of  F. 

f)  Repeat  steps  (c)  through  (e)  for  every  column  in  C,t  ,  until  all  the 
rows  of  F  have  been  found. 

An  alternative  to  inverting  the  R(k)'s  is  to  generate  the  pseudoinverse 
(equivalent  to  R(k)-1  when  this  exists)  of  each  R(k)  by  Conjugate  Gradients. 
This  method,  described  in  Luenberger  (1969),  generates  a  series  of  Nr-vectors 
v1  thatare  conjugate  directions  of  R(k) :  v>TR(k)  v>  =  0,  i  ^  j,  and  that 
can  be  used  to  form  the  pseudoinverse: 

+  R 

R( k )'  =  I  (v1T  R(k)  v‘)"1  v!  vlT  ,  R  =  rank[R(k)]  (3.22) 
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In  cases  when  C^1  (and,  therefore,  at  least  one  R(k)-1)  does  not  exist,  the 
normal  equations  have,  nonetheless,  solution.  This  is  because,  being  covariance 
matrices,  c/t  is  always  in  the  span  of  C**.  This  means  that  there  is  always 
an  exact  solution  to  _ua(  k )  =  R  ( k )  _£  ( k )  that  can  be  obtained 

Ua(k)  =  R+(k>  *ft(k)  (3.23 

This  idea  has  been  used  successfully  to  get  the  results  in  paragraph  5.5. 


3.4  Equatorial  Symmetry 

If  every  row  in  the  grid  has  a  counterpart  in  the  opposite  hemisphere  and 
both  are  at  the  same  spherical  distance  from  their  poles,  then  the  grid  has 
"equatorial  symmetry".  The  equator  itself  (an  equatorial  band,  in  the  case  of 
blocks)  can  be  one  of  the  rows. 

If  the  grid  has  equatorial  symmetry,  then  Can  is  persymmetric:  two 
submatrices  Cu  are  equal1  if  they  are  symmetrically  situated  with  respect  to 
the  main  diagonal  (ordinary  symmetry)  or  the  main  antidiagonal.  More  generally, 
one  can  permute  the  ith  row  with  the  ith  column,  or  the  ith  row  with  the 
Ni  +  1  -  i  th  column,  in  a  persymmetric  matrix  of  dimension  Nj  ,  without  modi¬ 
fying  the  matrix.  Since  R(k)  is  formed  by  taking  one  Fourier  coefficient 
a“  from  each  Ciji  it  follows  that  R(k)  is  also  persymmetric. 

# 

An  important  property  of  persymmetric  matrices,  from  the  point  of  view 
of  this  study,  is  the  following; 

(1)  If  v  is  an  "even"  vector; 

V  =  l  Vj  v3. .  .  V,  ...  vNr  ]T 
V1  =  v\  ;  v3  ~  vn,— i •  •  •  V,  =  v„r_i  for  0  <  i  S  Nw 

(2)  Or  if  v^  is  an  "odd"  vector: 

V  =  1  V;  vs. . .  V,. ..  V„r  1 T 

vi  =  -vNr  I  vs  =  -Vvl...v,»-Vk,r_,  for  0  S  is:  N«  (3.25' 

then  the  product  of  v  by  a  persymmetric  matrix  is  also  "even"  or  "odd",  re¬ 
spectively.  fa  other  words:  multiplication  by  a  persymmetric  matrix  preserves 
the  "parity"  of  the  vector.  Any  vector  b  can  be  decomposed  into  an  "even" 
part  -0=0  a&d  an  "odd"  part  bg_t  : 


(3. 24 

Nr/2  if  Nr  is  even 
(Nr-l)/2  if  Nr  is  odd 


Remember  that  all  CtJ  are  symmetrical  and,  being  Toeplitz  circulant,  are 
also  persymmetrical. 


b0=o=[b?  &...b?...b£pf  b,° 

bg=1  =  [bi  b*  ...  b\...  bir)r  b1, 


*  <bi  +  b_,^v  +  1),  (bCH+i  -  b^,,  if  Nr  odd) 
^  (b,  -  b_1+Nr+1)  (3.26) 


Therefore,  an  equation  of  the  type 


u<y(k)=  R ( k )  xQ(k) 

can  be  separated  into  two  independent  equations.- 

.00  „  ,  (it) 

—a$=0  R<k>  Zrx0  =  o  (3.27-a) 

=  R<k)ia3  =  i  (3.27-b) 


where  /3=0  indicates  "even”,  and  0  =  1  "odd".  The  solution  (3. 27 -a)  must  be 
"even",  while  that  to  (3.27-b)  must  be  "odd",  so  the  actual  number  of  "degrees 
of  freedom"  is  NH  =“  number  of  unknowns /2.  When  half  of  the  unknowns  are  found, 
the  other  half  must  have  the  same  or  opposite  values.  Therefore,  only  the  first 
N„  equations  in  (3.27-a,b)  are  needed  to  solve  the  system.  Instead  of  the  original 
system,  one  can  solve  the  equivalent-. 


-  Vk'*a/J 

~(lr)  *^(k)  ~ 

where  ua^  and  xap  have  dimension  NM  ,  R^(k)  is  NH  x  NM 


k  < 

r„  +  r.N._ 


n+3 


(  -1) 


0 


(3.28-a) 

and  has  elements 

(3.28-b) 


with  1  s  n  s  N„  and  1  ^  m  *  NH,  where  \?a0  contains  the  first  NH  elements 
in  Ea0  an<^  the  first  NH  in  .  There  are  twice  as  many  equations 

such  as  (3.28)  than  there  are  equations  uke  (3. 19),  but  the  reduction  in  size  of 
the  matrices  by  half  brings  a  considerable  increase  in  efficiency. 


4.  Computing 

This  section  considers  the  implementation  of  the  algorithm  for  grided  data 
from  the  point  of  view  of  efficiency  and  of  reliability  of  results.  Also  certain 
numerical  stability  matters  are  treated. 


4. 1  Setting  up  the  Matrix  C,m  =  C„  +  D 

Usually  D  is  a  diagonal  matrix,  so  calculating  its  contribution  to  C4d  is 
trivial.  If  this  is  not  the  case  ("colored  noise"),  this  matrix  is  handled  in  the 
same  way  as  C„.  C„  contains  all  the  covariances  of  the  signal  in  the  data: 
because  the  symmetries  in  the  grid  are  reflected  in  the  structure  of  C„,  it  is 
not  necessary  to  compute  every  one  of  them.  C„  is  symmetrical,  so  only  half 
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of  its  elements  have  to  be  found.  Every  ClS  block  is  Toeplitz  circulant,  so 
only  the  first  row  has  to  be  known,  and  this  first  row  is  "even",  as  already 
explained,  so  only  half  of  its  elements  are  different.  If  there  is  equatorial 
symmetry,  then  C„  is  persymmetric,  and  this  reduces  the  number  of  dis¬ 
tinct  elements  by  half  once  more.  The  number  of  covariances  in  C„  is 
(NeN,)3:  after  all  symmetries  are  considered,  only  NeNrs/4  have  to  be 
computed  if  the  grid  is  not  symmetric,  and  only  Nc  Na/8  if  it  is.  Computing 
each  covariance  should  take  the  same  amount  of  operations,  so  the  central 
processor  time  needed  is  porportional  to  the  number  of  distinct  elements.  This 
means  that  a  matrix  such  as  Clt  requires  4NC  times  less  to  be  set  up  than  an 
ordinary  symmetric  matrix  of  the  same  dimension,  and  8NC  times  less  if  the 
grid  is  symmetrical.  In  the  case  of  a  regular  l°x  1°  grid  with  64800  points 
(blocks)  there  is  a  reduction  in  effort  of  the  order  of  1400  times.  Furthermore, 
the  existence  of  redundancy  in  the  elements  of  Cdd  can  be  used  to  decrease  the 
storage  requirements  for  this  matrix,  that  would  otherwise  be  truly  enormous 
even  with  moderately  large  data  sets. 


4.2  Solving  the  Normal  Equations 

Solving  a  system  of  equations  requires  a  number  of  operations  proportional 
to  the  cube  of  the  number  of  unknowns.  Assuming  that  the  same  method  were 
used  to  solve  the  original  equation  for  one  row  of  F 

f*=  CSc/, 

that  is  used  to  solve  each  of  the  equations 

H*(k)  =  R<k)X<*(k)  or  otjj  =  Rfl(k) 

as  the  case  may  be,  then  the  total  saving  in  computing  time  due  to  the  structure  of 
Qd  is  of  the  order  of  Nc3.  The  use  of  a  symmetric  grid  increases  efficiency  by 
a  factor  of  four.  The  additional  time  needed  to  do  the  "Fourier  analysis"  of  the 
right  hand  sides  or  the  "Fourier  synthesis"  of  the  solution  vectors  is  quite  neg¬ 
ligible,  even  for  large  numbers  of  data  points,  thanks  to  the  existance  of  very 
efficient  algorithms,  particularly  in  the  case  when  Ne  is  a  power  of  2.1  In  the 
case  of  the  regular  l°x  1°  grid,  with  Nc  =  360  ,  the  reduction  of  computing  time 
is  of  the  order  of  130000. 

Computing  the  covariances  that  form  the  Qd  matrix,  even  after  all  sym¬ 
metries  have  been  fully  exploited,  remains  no  trivial  task  if  the  data  points 
(blocks)  are  very  numerous.  In  the  case  of  point  data  this  situation  is  helped 
by  the  existence  of  closed  covariance  expressions  such  as  those  found  by  Tscher- 
ning  and  Rapp  (1974)  for  isotropic  functions.  In  the  case  of  block  data,  as  paragraph 

5.3  shows,  the  covariances  of  area  means  are  "area  means  of  area  means  of  covar¬ 
iances,"  involving  double  area  integrals  over  the  various  blocks.  Numerical  in¬ 
tegration  could  be  used  as  an  approximation,  but  this  would  require  a  major 
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operation  if  the  dimension  of  CAi  is  large.  It  would  be  most  desirable  to  have 
closed  expressions  (or  convenient  approximations)  for  area  means'  covariances, 
but  they  are  not  known  to  this  author.  ‘  Reductions  in  computer  time,  while  val¬ 
uable  in  themselves,  are  not  the  only  important  gain:  fewer  computations  mean 
less  rounding  e  rrors  accumulation,  and  more  reliable  results. 


f 

All  the  properties  of  the  covariance  matrix  mentioned  so  far  apply,  in 
the  case  of  isotropic  and  other  covariances,  not  only  on  the  sphere  but  also  ‘ 

on  any  surface  of  revolution,  as  long  as  the  "rows"  are  defined  by  circles  per-  { 

pendicular  to  the  axis  of  rotation.  Such  surfaces  include:  the  cylinder,  the  i 

cone,  and  the  geodesist's  old  friend,  the  oblate  spheroid.  The  same  structure  ! 

arises  from  concentric  rings  on  a  plane  and,  of  course,  from  the  regular  sam¬ 
pling  of  a  circumference.  Equispaced  sampling  along  a  straight  line  results  in  j 

a  Toeplitz  matrix,  different  from  a  Toeplitz  eirculant  one  in  that  the  last  ele-  \ 

ment  of  a  row  is  usually  "lost"  because  a  different  number  appears  as  the  first  ! 

element  in  the  following  row,  while  all  other  elements  are  shifted  one  place  to 
the  right  as  before.  Equispaced  sampling  on  a  rectangular  grid  in  the  plane 
produces  Toeplitz  block  matrices  of  Toeplitz  blocks.  All  Toeplitz  matrices  can 
be  set  up  and  inverted  efficiently,  with  approximate!)  (dimension)2  operations 
per  inversion.  This  is  also  the  case  with  the  type  of  block  matrices  dis¬ 
cussed  in  this  paper,  as  already  shown.  The  properties  of  Toeplitz-type  |.md 
closely  related  Hankel-type)  matrices  have  been  used  to  devise  algorithms  for 
minimum  variance  prediction  and  filtering  on  the  real  line  (time  domain)  and 
on  the  plane.  Examples  of  the  first  application  are  the  algorithms  of  Levinson 
(1947)  and  Trench  (1964).  For  the  plane  there  is  an  interesting  method  due  to 
Justice  (1977).  In  Geodesy  there  has  been  a  recent  application  of  Toeplitz 
matrices  to  the  prediction  of  ocean  gravity  anomalies  from  satellite  altimetry, 
by  Eren  (1979).  Besides  what  might  be  called  "outright"  Toeplitz  matrices 
(eirculant,  block,  or  plain)  there  are  "near  Toeplitz"  matrices  and  operators 
which  have,  to  a  lesser  degree,  some  of  the  advantages  considered  here.  Such 
structures  have  been  studied  by  Kailath  (1975)  among  others:  it  was  at  a  talk 
delivered  by  him  at  the  University  of  New  South  Wales,  in  early  1976,  that  ihe 
author  of  this  paper  first  became  aware  of  the  many  uses  of  Toeplitz  matrices. 


4. 3  Numerical  Stability 


The  poles  (circular  bloc  ks  about  the  poles)  have  been  excluded  by  restriction 
C-2  (C-2')  because,  in  order  to  partition  C«  into  Nr'  N.-  x  N,  suhma trices  C t j . 
the  corresponding  measurement  would  have  to  be  artificially  treated  as  N.-  meas¬ 
urements  at  the  same  point  (block),  introducing  NV  rows  and  columns  ir  C,u  that 
are  identical,  thus  making  Cm  singular.  Even  with  this  restriction,  row's  verv 
close  to  the  pole  may  create  stability  problems,  as  the  matrix  will  tend  to  become 

One  approximation,  based  on  the  ivl linen  "smoothing  factors"  (rendered  into  a 
recursive  form  by  Meissl  (1971)),  might  be  acceptable  for  very  fine  subdivisions 
of  the  sphere,  though  it  requires  the  use  of  Legendre  harmonic  expansions  trun¬ 
cated  to  a  high  degree. 


-13- 


singular  as  the  rows  approach  the  pole.  In  particular,  small  perturbations  in 
the  elements  of  may  have  serious  consequences  for  the  solution  of  the  nor¬ 
mals.  The  author  found,  when  computing  the  results  presented  in  Example  III, 
where  the  grid  was  confined  to  small  circular  caps,  that  interpolating  linearly 
from  a  table  instead  of  computing  each  covariance  exactly  from  a  closed  ex¬ 
pression  (in  order  to  save  time)  resulted  in  a  matrix  with  some  negative  eigen¬ 
values  when  the  table  entries  were  spaced  at  more  than  0.25km  intervals!  He 
finally  computed  all  covariances  from  the  closed  expression,  and  the  problem 
disappeared.  For  a  discussion  of  the  interpolation  problem,  see  Siinkel  (1978). 

Another  way  of  computing  covariances  approximately  is  by  truncating  their 
spherical  harmonic  expansions,  which,  in  the  isotropic  case,1 * 3  are  of  the  type 


C n(  (0  )  **  f  CIX  n  P„(COS  I l>)  >  C«  t(  )  E  Cg  ,  n  P n  (cos  0 ) 

n=a  *  a-a  ’ 


(4.1) 


at  a  "sufficiently  high"  degree  Ns*  (usually  N,»  =  1000).  If  the  spacing  between 
meridians  is  AX<  tt/nbX  then  Cdd  la  =  0,  where 


f“  - 

~a 


HI)---  *■  (!)•••  ""-(l)]' 


if  N,x  <  k  <  Integer  (tt /AX).  Furthermore,  R(k)=0  if  N»x<k<  Integer 
(  t / AX ).  This  presents  no  problem  if  c,z(tp)  is  computed  using  an  expansion 
truncated  to  N.x,  because  c,t  contains  no  higher  frequencies  in  its 
columns,  which  are  in  the  span  of  C<u  »  and  the  desired  solution  can  be  ob¬ 
tained  using  the  non-zero  R(k),  0«ks  N,,. 


4.4  Eigenvector  and  Eigenvalue  Decomposition  of  Cdd 

One  possibility,  when  dealing  with  an  ill-conditioned,  real  symmetric  matrix, 

C  ,  is  to  decompose  it  into  eigenvectors  and  eigenvalues: 

*ank  C  j 

c  =  r  x ,  t [i  Mi 

i=i 

(where  the  ^  are  the  eigenvectors  and  the  eigenvalues)  and  then  form  a 

pseudo  inverse 

C  =  E  X,,  fiq 

1  =1 

where  the  Xq  are  those  eigenvalues  of  C  that  satisfy  the  condition  X,  >  e  >  0 

1  In  the  case  of  area  means  this  problem  can  be  alleviated  by  using  equal  area  grids 

with  blocks  of  constant  longitude  span,  and  latitude  spans  that  increase  towards 
the  poles. 

3  Pn  is  the  nth  unnormalized  Legendre  polynomial;  cZSt(.  and  csz>n  are  the  nth 
degree  variance  of  z  and  the  n  th  degree  covariance  between  z  and  s,  re¬ 
spectively. 


for  e  "sufficiently  small".  In  other  words:  the  true  inverse  is  "truncated”  to 
that  part  of  its  expansion  that  can  be  regarded  as  "sufficiently  positive  definite". 
For  this  and  other  reasons  it  is  interesting  to  know  the  eigenvector/eigenvalue 
decomposition  of  . 

From  Section  3  it  follows  that  if 

m)i  =  jU(  m)i,\r  ) T 

is  an  eigenvector  of  R(m)  and  X(m)i  is  the  corresponding  eigenvalue,  then 
X  (  m)j  is  also  an  eigenvalue  of  Cdd  ,  and 

v“(m),  =  M(m),^r(|^)]T 

the  associated  pair  of  eigenvectors  of  .  Therefore,  the  spectral  decomposition 
of  the  Nc/2  matrices  R(m)  is  equivalent  to  that  of  Cja  .  However,  it  is  far 
easier  to  decompose  Nc/2  Nr  x  N-  matrices  than  to  do  the  same  to  one  NrNc  x 
Nr  Nc  matrix. 


4.5  Regularization  of  the  Normal  Equations 

Sometimes,  when  a  matrix  C  is  too  ill-conditioned  for  the  solution  to  the 
corresponding  system  £  =  C  x  to  be  computed  reliably,  a  simple  form  of  regu¬ 
larization,  that  gives  more  stable  results  to  a  slightly  different  problem,  con¬ 
sists  in  solving  the  system  -  (  C  +  a  I)  x  (where  a  is  a  very  small,  positive 
constant)  instead  of  the  original  equations. 

The  "trick"  consists  in  finding  a  value  of  a  that  stabilizes  the  results  with¬ 
out  causing  them  to  depart  too  much  from  those  for  the  original  system.  This 
can  be  done  with  relative  ease  if  a  spectral  decomposition  for  C  is  available. 

In  the  case  at  hand,  the  normal  equations  will  have,  after  this  regularization,  a 
"normal  matrix"  Cdd  +  oc  1^  x  N^.  This  can  be  "Fourier  analyzed"  as  before, 
yielding  the  Nc/2  matrices 

R'(m)  =  R(m)  +  nrI(Nr*Nr)  (4.3) 

so  the  regularization  of  Cdd  implies  that  of  each  R(  m) ,  which  can  be 
handled  as  in  Tikhonov  and  Arsenin  (1977). 


4.6  Grids  of  Higher  Symmetry 

The  high  efficiency  in  setting  up  and  solving  the  equation  Cdd  FT  =  C$\ 
made  possible  by  the  structure  of  Cdd  raises  the  question  of  the  possible  exis¬ 
tence  of  partitions  of  the  sphere  that  generate  even  stronger  structures.  The 
answer  is  yes,  and,  as  examples,  consider:  a  single  "row",  two  "rows"  sym- 


2 

l 


metrical  with  the  equator,  and  the  vertices  of  the  five  regular  (and  of  the  13 
semi-regular)  solids.  In  all  these  cases  the  matrix  Cdd  is  a  block  Toeplitz 
circulant  matrix  of  circulant  blocks,  while  the  matrices  considered  so  far 
were  simply  block  matrices  of  circulant  blocks.1  With  the  exception  of  the 
second  arrangement,  the  size  of  the  CtJ  is  1  x  1,  which  reduces  the  whole 
matrix  to  an  ordinary  Toeplitz  circulant  matrix,  the  setting  up  and  inverting 
of  which  is  almost  trivial.  Are  there  such  grids  with  large  numbers  of  nodes 
(blocks)  evenly  distributed  over  the  whole  sphere?  The  answer  to  this  question 
is  not  known  to  the  author,  but  its  importance  can  be  appreciated  by  the  reader. 
Paulik  (1976)  has  published  a  theorem  containing  a  sufficient  condition  for  the 
existence  of  this  type  of  grid,  as  well  as  a  constructive  principle,  linking  its 
existence  to  that  of  pairs  of  commuting,  nontrivial,  3x3  orthogonal  matrices. 
Whether  some  of  these  pairs  correspond  to  dense  grids  is  another  matter. 

If  Cdd  is  block  Toeplitz  circulant  of  circulant  blocks,  then  the  elements  in 
each  row  (column)  are  the  same,  only  their  order  changes.  In  the  case  of  iso¬ 
tropic  covariances,  this  means  that  the  set  of  distances  from  any  data  point 
(block)  to  all  the  others  must  be  independent  of  the  data  point  (block)  chosen. 
Clearly  this  is  a  necessary  condition. 

5.  Examples 

This  section  illustrates  the  application  of  the  method  to  spherical  harmonic 
analysis  of  gridded  point  data  and  of  area  averages,  and  to  estimating  disturbing 
potential  from  gravity  anomalies. 


5. 1  Spherical  Harmonic  Analysis  of  Point  Data 

Spherical  harmonic  analysis  is  to  data  distributed  on  a  sphere,  what  Fourier 
analysis  is  to  data  on  the  line  or  on  the  plane.  Not  only  does  it  provide  greater 
insight  into  the  properties  of  the  information  available,  its  statistics,  and  its  re¬ 
lationships  to  other  signals  (see  Kaula,  1967),  but  it  also  allows  the  highly  efficient 
computation  of  convolutions.  Such  is  the  case  of  a  function  s 

CO  n  _  _ 

h(<o,X)  =  E  E  ?M(0)(c„,cos  mX  +  s„,  sin  ml)  (5.1) 

n  =o  ■s=0 

that  is  transformed  according  to 

If  the  data  were  portioned  by  meridians  instead  of  by  parallels,  Cdd  would  be 
a  block  Toeplitz  circulant  matrix,  but  the  blocks  themselves  would  not  be  Toeplitz. 
P™  is  the  associated  Legendre  function  of  the  first  kind,  of  order  n  and  degree 
m  (normalized);  c  and  s„,  are  fully  normalized  coefficients;  P„  is  the  un¬ 
normalized  Legendre  polynomial  of  degree  n  ;  da  indicates  an  area  integral 
over  the  whole  sphere;  0  is  the  spherical  distance  from  (£,X)  to  (C.X’) 

(see,  for  instance,  Hobson  (1965)),  Pn,(0)  is  shorthand  for  P^sinO). 


u(<o,X)  =  ~j“J*  S  ( ifr )  h(0  X' )  do  (as  in  Stokes'  formula)  (5.2) 

for  some  S(  ii)  =  T  k,  (  2n  +  1)  PB(  vi 

in  which  case 

U(«0,X)  =  r  £  k„  p„,(«5)  [C  ..cos  mX  +  s  sin  mX  J  (5.3) 

n  =0*  ~0 

Computing  (5.2)  by  n"  nerical  quadratures  is  far  more  laborious,  if  u  is  required 
at  many  points,  than  using  (5.3)  truncated  to  a  high  degree  and  order,  if  the  coeffi¬ 
cients  cM  ,  "Sc»  are  known.  Finding  these  coefficients  accurately  and  with  a  min¬ 
imum  of  computations  is  a  very  desirable  goal;  a  number  of  studies  have  been 
published  in  recent  years  on  the  "correct"  way  of  analyzing  data,  particularly 
when  given  in  the  form  of  area  means  (see,  for  instance,  Rapp  (1977)  and  Kat- 
sambalos  (1979)).  Much  of  the  effort  has  been  concentrated  on  computing  the 
coefficients  from  the  expressions 


=  4^  Jo  R«(<0*X)  da 

I.  =  f  SM  (O.X)  h(«,X)  da 

‘t  n 


(5.  4-a) 
(5.  4-b) 


based  on  the  orthogonal ity  of  the  harmonics  ,X)  =  PM(sin  to)  cos  mX  and 

Sa,(to,X)  =  P^sin  <p)  sin  mX  on  the  sphere,  using  numerical  quadratures.  Such 
approaches  can  be  very  efficiently  implemented:  in  1976  C.  Rizos  and  the  author 
wrote  Fortran  programs  for  harmonic  analysis  and  synthesis.  As  an  example, 
one  of  those  programs  took  1.3  minutes  to  generate  a  set  of  cn,’s  and  8,,'s  com¬ 
plete  to  degree  and  order  180  from  64800  lcx  1'  area  means  (gravity  anomalies) 
in  the  AMDHAL  470V/6-II  computer  of  the  Ohio  State  University. 

Because  the  data  is  sampled,  there  is  usually  not  enough  of  it  to  estimate  the 
coefficients  exactly:  the  resulting  error  is  known  as  aliasing,  and  it  depends 
both  on  the  data  distribution  and  on  the  numerical  technique  used.  Moreover,  the 
data  usually  contains  spurious  signals,  measurement  errors  for  instance,  that 
also  affect  the  results.  A  way  of  computing  harmonic  coefficients,  minimizing 
the  effect  of  noise  and  aliasing  simultaneously,  has  been  described  by  Rummel 
(1976)  and  by  Sjoberg  (1978).  The  idea  is  to  estimate  the  coefficients  using  he 
minimum  variance  method,  or  collocation,  which  involves  solving  the  normal 
equations  FT  =  C,'2  for  this  particular  problem.  Under  the  restrictions 
listed  earlier  on,  which  means,  for  instance,  using  an  ordinary  regular  grid 
and  isotropic  covariances  as  defined  by  (4. 1)  with  all  data  on  the  same  sphere,  the 
Qa  matrix  has  the  advantageous  properties  mentioned  so  far,  and  can  be  treated 
accordingly.  What  of  the  CSI  matrix?  The  c„a  and  s"n,  are  functions  of  the 
system  of  coordinates  chosen:  rotating  the  X  origin,  and  shifting  the  poles 
change  their  values.  If  <0  and  X  are  the  coordinates  of  the  shifted  N  pole 
with  respect  to  some  fixed  system,  then  c  .s  =  c  n,(«fl,X)  and  sra  sn>(<o,X) 
can  be  regarded  as  functions  of  the  pole  coordinates:  ordinary'  functions  of  cfl 


1  At  the  Department  of  Geodesy,  School  of  Surveying,  The  University  of  New 
South  Wales,  Australia. 
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and  X  to  be  estimated  at  the  "North  pole"  of  the  grid.  Assuming  that  Mf  } 
is  the  isotropic  operator,  using  (5.4)  and  the  orthogonality  properties  of  surface 
harmonics: 


M{cslh(tf„A,)}  =  —■Mlj"  Rn.(V.X)h((OrA)  h(e>.  ,A.)  da] 

=  T-f  RPS  M[  h(co,  X)  h(o , ,  Xj)  }  da 

4tt  J  q 

=  ~f  | „  cn  P0 da,  (p  =(#,X),  Q  =(Oi,X.)) 

=  (2n  + 

Similarly, 

M  f  s„a  h(o  ( ,  X  j ) }  =  ~  “  r-  S*  i » X , )  , 

(in  +  1) 


So 


M 


(2n  +1) 


/Cos  \  2" 

(sin  Oi)(, 


sin 7 


N 


mj 


(5.  5) 


From  (5.  5),  it  is  clear  that  the  columns  of  Cs",  are  already  separated  in  frequency, 
with  m  taking  here  the  place  of  k  for  convenience  in  notation.  In  the  case  of 

equatorial  symmetry  u*  0(m)  =  [  P-,( ),  Paa(ip2) . P.,(<pNr)] r  and  u^=1(m)  = 

[  Prl(<Pi), . . . ,  Pm(0n  )  P  are  identical  and  either  "even"  or  "odd"  vectors,  because 
the  P.a  are  even  functions  of  <p  if  n~m  is  even,  and  odd  functions  if  n-m  is  odd. 
In  any  event,  u  _e(m)  =  ua=1(m)  regardless  of  symmetry.  The  complete  decom¬ 
position  of  the  columns  of  Cl*  (right  hand  sides)  is  immediate:  all  that  has  to  be 
computed  are  the  values  of  P,J<Pi)  for  is  is  Nr ,  for  which  there  are  simple 
recursive  formulas  by  order  and  by  degree.  After  solving  the  reduced  equations: 

vn(m)  =  R(m)^n(m)  or  Ug(m)  =  R(m)-^g(m) 

(depending  on  whether  the  grid  is  symmetric  or  not  w.  r.  t.  the  equator),  where  the 
subscript  a  is  superfluous  and  has  been  dropped,  the  "synthesis"  of  the  corres¬ 
ponding  row  in  F  is  also  immediate,  say 


jn  the  eijuatorially  symmetric  case,  where  is  the  row  in  F  corresponding  to 

c„  or  sna.  Once  the  \g(m)  are  known,  the  estimates  are  obtained  as  follows 
(again  in  the  symmetric  case). 


»  /CtN 


=  > 


JlxS....(b?)+  t  -  Np+l-i  (5.6) 

*  t  _ ;  2~  mi 

a.  =  c,  d.  -  7  cos  ———*  d.,  (5.7-a) 

—  —  ;  0  N  ■ 

b*  -  s_  dt  -  ^  sin  “  ■ d,  ,  (5.7 -b) 


-  1* 


Expressions  (5.  7)  represent  the  Fourier  analysis  of  each  N0  partition  of  the 
Nr  ft -data  vector  d.  So  the  algorithm  for  getting  and  ,  once  the  values 
of  the  X^(m)  are  known,  is  as  follows: 

a)  Find  the  (  ‘  1  by  Fourier  analysis  of  the  d, ; 

v  b(  ' 

m 

b)  Use  the  x]g(ni)  and  ( j^.)  as  in  expressions  (5.6)  and  (5.7)  to  find  the 
estimated  cTn,  and  a„ . 

For  reasons  explained  in  the  next  paragraph,  m  is  to  be  limited  to  the  range 
l  N0/2  if  M;  is  even 
0  4  m  *  N  -[  (Nc-l)/2  if  No  is  odd. 

Steps  (a)  and  (b)  can  be  performed  veir  efficiently,  even  for  large  numbers  of 
data  points,  because  of^the  power  of  the  Fourier  algorithms  available  at  present. 
Clearly,  all  cna  and  S’,,,  of  the  same  order  m  can  be  found  quite  independently 
from  the  rest  of  the  coefficients.  This  separability  in  order  is  also  present  in 
the  calculation  of  coefficients  by  least  squares  adjustment  on  a  regular  grid. 

Such  adjustment  is  based  on  the  "observation  equations" 

Inthis  technique,  as  in  collocation,  there  is  also  separation  by  frequency  and  parity.  When 
n  *  N  and  m  =  N  are  the  largest  degree  and  order  present  in  the  data,  the  least 
squaies  method  yields  perfect  estimates  (assuming  no  noise).  Collocation,  being 
a  minimum  variance  technique,  also  gives  perfect  estimates  when  applied  to  such 
data.  An  important  difference  between  die  two  procedures  is  that,  for 

h(«p,X)  *  |  S  |C„,  S„(W,X)  +s„.S„.(<p,X)] 

a  “0  *—  u 

least  squares  has  no  control  on  the  aliasing,  while  collocation  minimizes  it 
(provided  the  appropriate  covariances  are  used). 


5.2  Aliasing 

A 

/  (jr  v 

When  data  is  noiseless,  the  error  in(i£-"“)  is  a  function  of  higher  degree  and 
order  coefficients.  This  type  of  error  is  commonly  known  as  aliasing:  high  fre¬ 
quency  waves  become  indiscernible  from  lower  frequency  ones  because  of  the 
sampling.  To  understand  this  problem,  aliasing  can  be  considered  first  by  degree 
and  then  by  order. 

I  -  Higher  degrees:  There  are  Nr  samples  in  latitude,  so  there  are  at  most  Nr 
independent  columns  in  C.rx  of  the  form 

“  |Pn«(tOl  )(  “  )>•••»  Pn»(^r)(”  )  1 

for  any  given  m  .  Therefore,  if  n  ^  Nr  , 


-19- 


n«  _  Ni  1* 

=  141  a<£*i,a 

for  some  real  numbers  at.  If  Cdd  Is  invertible,  F  has  the  same  rank  as  Ctz, 
and  there  are  only  Nr  independent  f.  =  cl]  c]*>0,.  If  n  >  Nr 

-a  =  Cdd\li  =  ,?x  cdd  £.'*,» a i  =  ttx  f.1^  a, 

(fc)  s-.l.  •>.(!;;) 

or,  after  several  more  steps, 


and 

fn*  = 
-a 

■■  Cdd\E, 

(ln,S) 

_  f  niT 

\Sn./ 

"  “0( 

(|“)  =  ?  a-  (Jn‘) 

v  Sla  n  =Nr+l  V  Sn>/ 


(5.8) 


for  some  real  numbers  </tn  .  Expression  (5.8)  indicates  a  dependence  of  the 
estimated  coefficients  on  coefficients  of  higher  order.  The  actual  coeffi¬ 

cients  are  quite  independent  from  each  other. 


II  -  Higher  orders:  There  are  Nc  samples  per  row,  so  only  Nc/2  1  terms  in 
cos  rrAj  and  Nc/2  in  sin  mXj  (defined  in  (5.5))  can  be  independent  simultaneously. 
This  means  that  terms  of  different  frequency  are  lumped  together,  as  shown  by 
the  trigonometric  relationship: 


(  m  +  Nn  k  )  q 


where  k  =  0,1,2,,,..  Consequently,  both  the  rows  in  C.t  and  in  F  correspon¬ 
ding  to  (%na\  with  m>  Nc/2  will  have  the  form  , 

\  »  / 


(where  m  =  m'  +  (N;  k)/2)  and  will  be  linear  combinations  of  the  Nr  independent 
columns  with  the  same  lower  order  m'  .  It  follows  that  there  is  dependency  between 
coefficients  of  order  m'  and  those  of  order 


m  =  m'  +  Ne  ,  m'  +2nc  ,  m'  +  3N,. . m'  +  kNc  ,. . . 

From  ( I)  and  ( II)  it  is  clear  that  estimates  of  degree  n  <  Nr  and  order  m  <  Nc/2 
are  functions  of  higher  frequency  terms.  This  dependency  usually  gets  worse  as 
the  upper  limits  are  approached  by  n  and  m  .  In  regular  partitions,  where 
A<0  =  AA  ,  so  Nr  Nc/2  ,  this  common  upper  limit  for  degree  and  order  is 
usually  referred  to  as  the  "Nyquist  frequency"  Nc/2  ,  after  its  time  series  analogue. 

Aliasing  is  a  problem  present  in  all  forms  of  harmonic  analysis,  including 
that  on  the  sphere.  Some  methods,  however,  are  less  affected  by  it  than  others. 

For  noiseless  data,  aliasing's  the  error  in  the  estimates  of  the  coefficients.  So, 
if  the  mean  square  error  in  each  coefficient  is  the  criterion  for  measuring  aliasing 

1  Nc/2  is  only  correct  within  one  unit  of  the  actual  number.  "Nc /2  "  is  used  here 
and  in  the  remainder  as  a  simplification. 
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for  that  coefficient,  then  for  a  given  class  of  functions  h  (0,  A)  with  covariance 


1 


Mfh(P)h<Q)}=  1  C,P  B(i|)pq) 

or  degree  variances  (power  spectrum)  (Tn  —  cR,  minimum  variance  analysis 
(or  collocation)  is  the  procedure  with  the  least  possible  aliasing. 


5.3  Spherical  Harmonic  Analysis  from  Area  Means 


Sjoberg  (1978)  has  derived  expressions  for  the  isotropic  covariances  of 
area  means  of  gravity  anomalies  over  blocks  of  the  type  shown  in  Figure  2. 1-b,  and 
for  the  covariances  between  such  area  means  and  the  normalized  harmonic  coeffi¬ 
cients  of  the  signal  before  averaging.  Generalizing  those  formulas  to  area  means 
h  of  a  function  h  ,  we  have 

“&M  -15^  L,  ttod!'  «5-9> 

where  Aot,Aorj  are  the  areas  of  the  ith  and  j  th  blocks,  and 

,  ,5..-,  ,  ,  _  _c._  »><”•«*’ 

(2n  +  1)  (sin <pN ^  -  sin  #Sj)  x 


x 


i  1  if  m  3  0  ,  a  =  0 
( (l/m)[a,,(AA)  (g°in)mAWj  +  (-1)  b,(AX)( 


sin 

cos 


.) 


m\Mj  ] 


where  a, (AX)  =  sin  mAX 

b.(AX)  =  (1  -  cos  mAX) 

XWj  -  West  most  longitude  in  j  block 
(Oh  j  =  North  most  latitude  in  j  block 
<ffSj  =  South  most  latitude  in  j  block. 


Expression  (5. 10)  shows  that  the  "Fourier  analysis"of  the  partitioned  rows  of  C,t 
is  immediate,  resulting  in  a  "sine"  and  "cosine"  pair  per  row,  twice  as  many  terras 
as  in  paragraph  5.1.  Furthermore,  if  the  grid  has  equatorial  symmetry, 
the  integrals 

P„(«J)  cos  (0  do 

J<a  s, 

nm  J 

and  the  resulting  c„fl  vectors  split  naturally  into  even  and  odd,  as  before.  Except 
for  the  existence  of  "sine"  and  "cosine"  pairs  arising  from  each  column  in  Ciz  , 
the  basic  algorithm  is  applied  in  the  same  way  as  for  point  data.  All  important 
considerations  made  in  the  previous  example  apply  to  the  analysis  of  area  blocks, 
particularly  those  pertaining  to  aliasing. 


There  remains  the  question  of  which  way  of  representing  the  data,  usually 
not  sampled  on  a  regular  grid,  is  least  affected  by  aliasing:  whether  coefficients 
estimated  from  computed  area  means  are  necessarily  more  accurate  than  those 
calculated  from  point  values  interpolated  on  a  grid.  This  is  not  a  simple  question, 
though  "common  sense"  (something  to  be  handled  with  extreme  caution)  suggests 
that  area  means  are  probably  better.  This  is  so  because  aliasing,  as  already 
explained,  depends  on  high  frequency  terms  that  are  "damped  out"  by  the  aver¬ 
aging,  while  low  frequency  terms  are  only  slightly  modified.  In  other  words,  the 
"signal",  or  low  degree  coefficients  to  be  estimated,  tends  to  be  greater  than 
the  "noise",  or  higher  degree  terms  that  are  not  estimated. 1  The  understanding 
of  the  spectral  characteristics  of  data  averaged  over  square  blocks  is  not  suffi¬ 
cient,  to  date,  to  give  a  more  definite  answer.  One  thing  is  clear,  however:  if 
the  coefficients  were  estimated  by  collocation  from  the  original  ungrided  data  the 
resulting  mean  square  error  would  be  the  least  for  any  linear  estimator  utilizing 
the  same  data,  including  those  that  first  "grid"  (average)  the  data  and  then  "col¬ 
locate"  the  coefficients  from  the  grided  (averaged)  data  set.  This  resulting 
error  will  include  the  effect  of  measurement  errors  and  of  aliasing. 


5.4  Collocation  and  Numerical  Quadratures 


Expression  (5.6),  in  the  general  case  of  a  nonsymmetric  grid,  becomes 


mj  du 


(5.11) 


where  x^m)  are  obtained  by  solving  (3.11).  lf_the  dat.  is  point  data  (to  simplify 
the  discussion),  a  common  way  of  computing  (§**)  is  by  "discretizing"  the  basic 
expressions 


(1»)  =  ±- 

Vsnl/  4tt 


r  P 

J(7  Pn 


(COS  \ 

sin/  dcr 


in  the  form 


(I;;)  - 


h  (0 1  ,Xj)  =  dij 

(2n/Nc)  j 


(5. 12) 


(5.13) 


where  the  WtJ  are  "quadrature  weights",  often  the  area  AOtj  of  each  block. 
Comparing  (5. 13)  to  (5. 11)  and  assuming  all  Pn»(Oi)  ^  0  ,  one  can  write: 


" lo1  W‘*  pa.  < 1 *  >  >  <  sto  >  niX,  h  ( 0  „  Aj) 


(5.  14-a) 
(5.  14-b) 


1  There  is,  however,  loss  of  information.  If  a  grid  such  as  that  in  Figure  2. 1-b  is 
used  (constant  A  A)  then  all  harmonics  of  order  m  *  Nek,  k  =  l,2, .... 
are  "averaged  out"  of  the  area  means. 
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where  the  Wi*  are  the  "collocation  quadrature  weights".  \s  already  explained, 
collocation  provides  the  least  aliased  coefficients  and,  for  noise  data,  also  the 
best  filtering,  of  all  linear  estimation  procedures.  According  to  (5.11;),  iru 
numerical  quadratures,  regardless  of  the  "weights"  used,  one  computes  (  £"») 
as  linear  combinations  of  the  data,  so  estimation  by  quadratures  is  linear. 
Expressions  (5. 14-a,b)  can  be  used  as  a  justification  for  considering  collocation, 
in  this  context,  as  a  numerical  quadratures  technique  with  optimal  weights  for 
reducing  both  aliasing  and  the  effect  of  data  errors. 


5.5  Estimation  of  Disturbing  Potential  from  Gravity  Anomalies 


The  value  of  a  function  u(<o,X)  at  the  "North  pole"  of  a  grid,  equals  the 
sum  of  the  unnormalized  ca0  or  "zonal"  coefficients  of  its  harmonic^  expans  ion. 
For  this  reason,  estimating  such  value  is  equivalent  to  estimating  (i£0  c„0  ,  and, 
when  the  data  is  grided:  d(<0i,Xj)  =  h(<plt\s)  +n(pj,Xj),  this  means  solving 
y"(0)  =  R(0)  ^*(0)  for  all  the  Nr  "independent" 

28(0)  =  [  PBo(sin  . Pn0(sin<9„r)]T 


and  then  computing 

00 

iJo  Cn0  ~ 


\  A  N  «r 

»=0  ft  =0  1  =  1 


0 

Xi 


Nc-1 

:Iod» 


This  is  a  simple  explanation  of  why  only  the  "zero  frequency"  part  of  the  estimation 
algorithm  has  to  be  carried  out. 


There  is  no  need,  however,  of  estimating  each  ctt0  separately.  Assuming 
that  the  covariance  is  Isotropic 

M{u(P)  h(Q)  }  =  J0  Cuh.n  Pn(iM 

then,  if  P  is  at  the  pole,  the  covariance  is  constant  for  Q  at  any  position  in  the 
same  row.  The  matrix  C,t  becomes  a  Nr  Nc -vector  of  the  type: 

C, ,  (  0  )  ”  (  c0,  y  2  C  Q,  «  ■  .  ,  cn ,  .  .  •  ,  Cq  ] 

where  c0  =  [1  1  1  1  ...  1]T  is  a  "zero  frequency  Nc -vector".  Consequently, 

matrix  FT  becomes  the  NpN« -vector 


where 

and 

So 
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=  X?  where  d,  =j|0d1J 


(5.  15) 
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Expression  (5. 15)  can  be  interpreted  as  representing  the  optimal  estimate  of 
u(P)  based  on  the  "ring  sums"  dtJ  .  It  is  easy  to  verify  that 

*T(0)  =  Cla'(Cdd  +  B)"1  =  uT(0)R(0)_1 

~  Me-! 

where  D  is  the  variance-covariance  matrix  of  n=  j£0n(J  so  this  interpretation 
can  be  used  quite  consistently.  The  main  consequency  of  all  this  is  that  u  (  P)  can 
be  estimated  from  the  NP  34 's  by  inverting  the  Nr  x  N,  matrix  R(0),  instead  of 
the  ( Ne  Nr )  x  (  ft  Nr )  matrix  Cdd ,  which  can  represent  great  savings  in  computing, 
and  a  corresponding  decrease  in  rounding  errors.  Setting  up  R(  0 ) ,  on  the  other 
hand,  requires  just  as  much  effort  as  forming  C4d »  because  the  same  number  of 
individual  covariances  has  to  be  computed  for  both. 

This  idea  was  used  by  the  author  as  part  of  his  research  on  the  creation  of 
a  world  height  system  (see  Colombo,  1979),  and  it  will  be  explained  more  fully 
in  a  future  report  on  that  project.  The  particular  application  was  predicting 
disturbing  potential  T  from  gravity  anomalies  Ag  inside  a  spherical  cap  sur¬ 
rounding  the  point  of  computation.  The  semi  apertures  of  the  caps  studied  were 
5°  and  10°,  the  rings  being  spaced  at  nearly  0.4°  intervals.  To  keep  the  sep¬ 
aration  between  "gravity  stations"  roughly  constant,  the  rings  were  progressively 
decimated  in  azimuth  towards  the  center.  While  this  departure  from  the  type  of 
grid  considered  so  far  invalidates,  in  a  strict  sense,  the  equivalence  between 
"point"  data  collocation  and  "ring  averages"  collocation,  the  effect  on  the  results 
is  very  small.  The  covariances  were  computed  using  the  "two-terms"  covariance 
functions  obtained  by  Jekeli  (1978),  that  have  finite  recursive  form. 

Setting  up  the  cdd  matrix  took  near  5  seconds  for  the  5°  cap  and  30  seconds 
for  the  10°  cap;  with  12  "rows"  and  475  points  in  the  first  case,  and  24  "rows" 
and  3000  points  in  the  second.  The  times  for  finding  the  optimal  estimator  F 
(including  the  inversion  of  R(0))were  of  the  order  of  1  second  in  both  cases. 

These  are  C.P.U.  times  using  the  Ohio  State  University's  AMDHAL  470V/6-II 
computer. 


6.  Conclusions 

Minimum  variance  linear  estimation  from  grided  data  can  be  implemented 
effectively,  even  when  the  number  of  measurements  is  very  large,  exploiting 
the  structure  induced  in  the  C<m  matrix  by  the  regularities  of  the  grid.  This  is 
true  of  both  "point  data"  and  of  "area  means".  The  restrictions  imposed  on  grids 
and  on  covariance  functions  do  not  exclude  those  used  in  most  applications:  the 
"regular”  grid  and  the  "isotropic"  covariance.  The  greatest  constraint  is  on 
the  "random"  part  of  the  noise,  that  has  to  have  the  same  standard  deviation  at 
all  points  in  the  same  "row". 

There  might  be  other  types  of  partitions  of  the  sphere  that  result  in  even 
stronger  structures  for  CddI  allowing  yet  more  efficient  algorithms  for  forming 
and  inverting  this  matrix.  Research  to  this  end  could  be  both  pleasant  and  profit¬ 
able. 


The  reduced  number  of  computations,  when  the  features  of  Cj,  are  ex¬ 
ploited,  means  not  only  shorter  computer  runs,  but  also  more  accurate  results. 

The  algorithm  presented  here  could  make  possible  the  computation  of 
spherical  harmonic  models  from  global  data  sets  to  very  high  degree  and  order 
and  with  minimum  aliasing. 

Even  after  all  symmetries  in  C<m  are  exploited,  creating  this  matrix  is 
now  the  major  task  when  doing  collocation  with  very  large  data  sets.  This  is 
particularly  serious  in  the  case  of  area  means,  as  the  covariances  are  the 
"point"  covariances  integrated  twice  in  two  dimensions,  as  shown  in  expression 
(5.9).  It  would  be  truly  useful  to  obtain  either  closed  expressions  for  this  co- 
variance,  along  the  lines  of  Tscherning  and  Rapp  (1976),  or  else,  approxi¬ 
mate  expressions  that  are  inexpensive  to  compute. 

Storage  requirements  can  be  reduced  drastically,  as  only  half  of  the  first 
row  of  some  of  the  Nc  x  Nc  partitions  Cij  of  Cdj  are  truly  needed;  while  the 
R(k)  and  their  inverses  are  relatively  small  matrices.  The  separation  of  the 
normal  equations  into  "frequencies"  (expression  (3. 19-b))  makes  this  approach 
easy  to  implement  in  a  parallel-processing  machine. 

Matrices  with  the  same  type  of  structure  considered  in  this  work  appear 
in  interpolation  and  filtering  with  symmetric  kernels  of  various  types,  besides 
covariance  functions.  In  particular,  this  is  true  of  such  things  as  point  mass 
models  when  the  points  are  distributed  on  regular  grids. 

Already,  the  ideas  presented  here  have  been  found  useful  in  estimating 
disturbing  potential  from  gravity  anomalies,  because  of  the  economies  in  com¬ 
puting  time  they  make  possible.  Soon  a  program  for  spherical  harmonic  analysis 
based  on  the  same  principles  will  be  attempted. 
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Appendix 

As  explained  in  Section  2,  the  matrix  of  noise  covariances  D  is  supposed 
to  consist  of  Toeplitz  circulant  blocks,  like  C« .  If  the  noise  is  uncorrelated 
from  measurement  to  measurement  (area  mean  to  area  mean)  then  D  is  diagonal. 
To  satisfy  the  Toeplitz  condition,  this  matrix  should  have  those  diagonal  elements 
corresponding  to  points  on  the  same  parallel  row,  at  least,  equal.  In  general, 
this  will  not  be  the  case.  Since  approximate  results  may  be  better  than  none, 
it  could  be  a  good  idea  to  use  a  modified  matrix  D'  instead  of  D ,  satisfying 
all  requirements  without  being  "too  unlike"  D .  Two  ways  of  doing  this  are 
considered  next. 

I)  D'  =  0 


The  noise  might  be  ignored  altogether,  when  the  quality  of  the  data  is  very 
good.  Then  the  problem  disappears,  because  D  =“  D’  =  0.  However,  it  is 
desirable  to  know  the  effect  on  the  estimates  £  of  the  noise  n 

s  =  F  d  =  F(z  +  n) 

Fr  =  (C„  +  D')'1  C;t  =  c;i  C.\  (A.  1) 

F  is  suboptimal,  because  the  noise  matrix  has  been  omitted  in  (A.l).  The 
variance  covariance  matrix  E  of  the  estimator  error  is 

E  =  Mf(s  -  Fd)(s  -  Fd)T|  =  C„  -  FC/,  -  C,,  F  +  F(C„  +  D)  FT 

(expression  (1.  2-a»  which  can  also  be  written  as 

E  =  E.  +  E„  =  [C..  -  FC,',  -  C„F  +  FC„FT]  +[FDFt1  (A. 2) 


Here  E,  represents  the  error  due  to  the  way  the  data  is  sampled  (aliasing),  and 
E„  is  the  contribution  of  the  "propagated"  noise.  Forming  E,  E.,  and  En  re¬ 
quires  calculating  a  number  of  matrix  products,  the  most  involved  of  which  is 
F  C.«  F  .  Another  matrix  product  of  interest  is  Cti  FT ,  needed  for  the  second 
part  of  this  Appendix.  These  products  can  be  obtained  efficiently  by  exploiting 
the  structure  of  C„ ,  which  is  the  same  as  that  of  C^'  .  Therefore,  the  method 
for  solving  the  system  of  equations  ^  =  C::  x  (i.  e.,  finding  CTt1^)  can  be 
applied  to  obtaining  Clt  £  as  well,  with  minor  modifications.  To  this  end,  the 
algorithm  described  in  paragraph  (3.3)  has  to  be  changed  as  follows; 


a)  Use  the  matrices  R(k)  and  the  vectors 
(where  f"  =  Z  F  is  the  nth  column  of  FT 

—  k=0Qf  =  0 

during  the  determination  of  F  ,  to  calculate 


fxor,l.^g£  )••••»  1 

),  which  have  been  obtained 


(A. 3) 


-28- 


where 


b)  Form  the  columns  of  H  =  C£.  F7  by  ’’Fourier  synthesis": 

hn  =  C„fr‘  =  1  t  e>  n  (A.  4) 

-  -  k  =o  cf-  o  ~a , 

where  <&,.  -  (ij) . Wl')1’ 

b’)  In  particular,  the  elements  of  J  -  FC!ZFT  are 

i»  -  L"c..t,=  t,,ij  -  .iejoA',-  .L^cS.0.i 

-  V,.,.  'A-5» 

where  H  has  been  defined  in  paragraph  (3. 2). 

II)  Replacing  D  with  an  "Average  Noise"  Matrix,  and  Refining 


If  the  noise  is  white,  but  its  standard  deviation  varies  from  point  to  point 
(or  block  to  block)  along  a  parallel,  a  diagonal  D’  matrix  could  be  chosen  \\  ith  all 
elements  corresponding  to  the  nth  data  row  equal  to  the  average  of  the  variances 
of  the  individual  data  in  that  row; 


=  ff,  = 


1 


MC~> 

Nc  i  -  c 


nl 


(A.  6) 


This  could  be  regarded  as  a  reasonable  approach  in  cases  where  the  noise  is  too 
great  to  be  neglected  without  serious  loss  of  accuracy.  Whichever  way  the  es¬ 
timator  F  is  obtained,  it  can  always  be  "refined"  (if  suboptimal)  to  make  it  more 
accurate.  One  way  of  doing  this  is  to  apply  "steepest  descent”,  where  the  var¬ 
iances  of  the  errors  (diagonal  elements  of  E)  are  reduced  simultaneously  along 
the  "line"  of  search  defined  by  the  present  F  and  the  matrix  gradient  of  the 

trace  of  E  ,  -rk(  E)  ,  to  obtain  the  "improved"  estimator  F«  : 

^  F 


Fx7  =  F7  +  [Cdd  F7  -  C.\  1  K 

-  F  7  +  l  C  tx  F  T  +  D  F T  -  CsT. )  K 


(A.  7 1 


K  is  diagonal,  and 
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fa  =  cMr  ~ 
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£  t 


f” 


+  Dfn 


-  c 


n 

§ 1 


(where  f°  and  c.n£  are  the  nth  columns  of  Fr  and  C,:  ,  respectively!.  The 
element  km  constitutes  the  optimal  "step  length"  for  fj  .  Expression  (A.  7!  re¬ 
quires  calculating  CIt  F  and  f_r  Ctt  f_r.  This  can  be  done  using  the  procedure 
explained  in  the  first  part  of  this  Appendix,  if  those  products  are  not  already 
available.  The  whole  procedure  can  be  iterated. 


